Amendments to the Claims 



Claim 1 (currently amended): A system for disambiguating speech input using one of 
voice mode interaction, visual mode interaction, or a combination of voice mode 
interaction and visual mode interaction with an application comprising: 

a speech disambiguation mechanism:

an options and parameters component for receiving user parameters and application 
parameters for controlling the speech disambiguation mechanism, wherein the speech 
disambiguation mechanism is controlled by parameters set by the user and parameters 
set by the application, and wherein the parameters include confidence thresholds 
governing unambiguous recognition and close matches; 

a speech recognition component that receives recorded audio, speech or a 
combination of the recorded audio and the speech input and generates: 

one or more tokens corresponding to the speech input 

a plurality of tokens corresponding to disambiguated words for presentation to the
user; and

for each of the one or more tokens, a confidence value indicative of the likelihood 
that a given token correctly represents the speech input; 

a selection component that identifies, according to a selection algorithm, which two
or more of the tokens are to be presented to a user as alternatives, wherein said
alternatives are words or tokens;

one or more disambiguation components that present the alternatives to the user in 
one of voice mode, visual mode, or a combination of the voice mode and the visual 
mode, and receive an alternative selected by the user in one of the voice
mode, the visual mode, or a combination of the voice mode and the visual mode; and

an output interface that presents the selected alternative without translation of the 
speech input to [[an]] the application as input. 

Claim 2 (cancelled): 

Claim 3 (cancelled): 

Claim 4 (original): The system of claim 1, wherein the one or more disambiguation 
components perform said interaction by presenting the user with alternatives in a visual 
mode, and by receiving the user's selection in a visual mode. 

Claim 5 (original): The system of claim 4, wherein the disambiguation components 
present the alternatives to the user in a visual form and allow the user to select from 
among the alternatives using a voice input. 

Claim 6 (cancelled): 

Claim 7 (original): The system of claim 1, wherein the selection component filters the 
one or more tokens according to a set of parameters. 

Claim 8 (original): The system of claim 7, wherein the set of parameters is user specified.

Claim 9 (cancelled):

Claim 10 (cancelled):

Claim 11 (currently amended): A method of processing speech input using one of voice
mode interaction, visual mode interaction, or a combination of voice mode and visual 
mode interaction with an application comprising: 

a speech disambiguation mechanism:

receiving user parameters and application parameters for controlling the speech 
disambiguation mechanism, wherein both the user and the application can set the 
parameters to control said speech disambiguation mechanism, and wherein the 
parameters include confidence thresholds governing unambiguous recognition 
and close matches;

receiving a speech input from [[a]] the user; 
determining whether the speech input is ambiguous; 

if the speech input is not ambiguous, communicating a token representative of the 
speech input to an application as input to the application; and 

if the speech input is ambiguous:

selecting two or more tokens to be presented to the user as alternatives,
wherein said alternatives are words;

presenting the alternatives to the user in one of voice mode, visual mode, 
or a combination of the voice mode and the visual mode, and receiving a
selection of an alternative from the user from the plurality of alternatives 
presented to the user in one of the voice mode, the visual mode, or a 
combination of the voice mode and the visual mode; and[[,]] 



communicating the selected alternative without translation of the speech 
input to the application as input to the application. 
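
For illustration only, and not as part of any claim, the flow recited in claim 11 can be sketched as below. The claims do not define how the confidence thresholds are applied, so the ambiguity test (best confidence below the unambiguous threshold), the fallback to the two highest-ranked tokens, and the names `Token`, `process_speech`, and `ask_user` are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Token:
    text: str
    confidence: float  # likelihood that the token represents the speech input


def process_speech(
    tokens: Sequence[Token],
    unambiguous_threshold: float,   # assumed: at or above this, input is not ambiguous
    close_match_threshold: float,   # assumed: at or above this, a token is a close match
    ask_user: Callable[[List[str]], str],  # hypothetical voice/visual prompt
) -> str:
    """Return the token text to communicate to the application as input."""
    ranked = sorted(tokens, key=lambda t: t.confidence, reverse=True)
    if not ranked:
        raise ValueError("no recognition candidates")

    best = ranked[0]
    # Not ambiguous (by assumption): pass the best token straight through.
    if best.confidence >= unambiguous_threshold or len(ranked) == 1:
        return best.text

    # Ambiguous: select two or more tokens to present as alternatives.
    close = [t for t in ranked if t.confidence >= close_match_threshold]
    alternatives = close if len(close) >= 2 else ranked[:2]

    options = [t.text for t in alternatives]
    choice = ask_user(options)
    if choice not in options:
        raise ValueError("selection is not one of the presented alternatives")
    # The selected alternative is passed on unchanged (no translation).
    return choice
```

In this sketch, candidates "Austin" (0.55) and "Boston" (0.52) with thresholds of 0.8 and 0.4 would both be presented to the user, whereas a single candidate at 0.95 would be passed to the application without prompting.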

Claim 12 (original): The method of claim 11, where the interaction comprises the 
concurrent use of said visual mode and said voice mode. 

Claim 13 (original): The method of claim 12, wherein the interaction comprises the user 
selecting from among the plural alternatives using a combination of speech and visual- 
based input. 

Claim 14 (original): The method of claim 11, wherein the interaction comprises the user 
selecting from among the plural alternatives using visual input. 